36 research outputs found

    Hybridization expansion Monte Carlo simulation of multi-orbital quantum impurity problems: matrix product formalism and improved sampling

    We explore two complementary modifications of the hybridization-expansion continuous-time Monte Carlo method, aiming at large multi-orbital quantum impurity problems. One idea is to compute the imaginary-time propagation using a matrix product state representation. We show that bond dimensions considerably smaller than the dimension of the Hilbert space are sufficient to obtain accurate results and that this approach scales polynomially, rather than exponentially, with the number of orbitals. Based on scaling analyses, we conclude that a matrix product state implementation will outperform the exact-diagonalization based method for quantum impurity problems with more than 12 orbitals. The second idea is an improved Monte Carlo sampling scheme which is applicable to all variants of the hybridization expansion method. We show that this so-called sliding window sampling scheme speeds up the simulation by at least an order of magnitude for a broad range of model parameters, with the largest improvements at low temperature.

    Matrix Product State applications for the ALPS project

    The density-matrix renormalization group method has become a standard computational approach to the low-energy physics as well as the dynamics of low-dimensional quantum systems. In this paper, we present a new set of applications, available as part of the ALPS package, that provide an efficient and flexible implementation of these methods based on a matrix-product state (MPS) representation. Our applications implement, within the same framework, algorithms to variationally find the ground state and low-lying excited states as well as to simulate the time evolution of arbitrary one-dimensional and two-dimensional models. Implementing the conservation of quantum numbers for generic Abelian symmetries, we achieve performance competitive with the best codes in the community. Example results are provided for (i) a model of itinerant fermions in one dimension and (ii) a model of quantum magnetism. Comment: 11+5 pages, 8 figures, 2 examples

    Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale

    Over the past few decades, the number of scientific articles and the volume of technical literature have increased exponentially. Consequently, there is a great need for systems that can ingest these documents at scale and make the contained knowledge discoverable. Unfortunately, both the format of these documents (e.g. the PDF format or bitmap images) and the presentation of the data (e.g. complex tables) make the extraction of qualitative and quantitative data extremely challenging. In this paper, we present a modular, cloud-based platform to ingest documents at scale. This platform, called the Corpus Conversion Service (CCS), implements a pipeline which allows users to parse and annotate documents (i.e. collect ground-truth), train machine-learning classification algorithms and ultimately convert any type of PDF or bitmap document to a structured content representation format. We will show that each of the modules is scalable due to an asynchronous microservice architecture and can therefore handle massive amounts of documents. Furthermore, we will show that our capability to gather ground-truth is accelerated by machine-learning algorithms by at least one order of magnitude. This allows us to both gather large amounts of ground-truth in very little time and obtain very good precision/recall metrics in the range of 99% with regard to content conversion to structured output. The CCS platform is currently deployed on IBM internal infrastructure and serving more than 250 active users for knowledge-engineering project engagements. Comment: Accepted paper at KDD 2018 conference

    ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents

    Transforming documents into machine-processable representations is a challenging task due to their complex structures and variability in formats. Recovering the layout structure and content from PDF files or scanned material has remained a key problem for decades. ICDAR has a long tradition of hosting competitions to benchmark the state-of-the-art and encourage the development of novel solutions to document layout understanding. In this report, we present the results of our ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents, which posed the challenge to accurately segment the page layout in a broad range of document styles and domains, including corporate reports, technical literature and patents. To raise the bar over previous competitions, we engineered a hard competition dataset and proposed the recent DocLayNet dataset for training. We recorded 45 team registrations and received official submissions from 21 teams. In the presented solutions, we recognize interesting combinations of recent computer vision models, data augmentation strategies and ensemble methods to achieve remarkable accuracy in the task we posed. A clear trend towards adoption of vision-transformer based methods is evident. The results demonstrate substantial progress towards achieving robust and highly generalizing methods for document layout understanding. Comment: ICDAR 2023, 10 pages, 4 figures